
Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis.



Abstract

BACKGROUND: Experimental designs that take advantage of high-throughput sequencing to generate datasets include RNA sequencing (RNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), sequencing of 16S rRNA gene fragments, metagenomic analysis and selective growth experiments. In each case the underlying data are similar and are composed of counts of sequencing reads mapped to a large number of features in each sample. Despite this underlying similarity, the data analysis methods used for these experimental designs are all different, and do not translate across experiments. Alternative methods have been developed in the physical and geological sciences that treat similar data as compositions. Compositional data analysis methods transform the data to relative abundances, with the result that the analyses are more robust and reproducible.

RESULTS: Data from an in vitro selective growth experiment, an RNA-seq experiment and the Human Microbiome Project 16S rRNA gene abundance dataset were examined by ALDEx2, a compositional data analysis tool that uses Bayesian methods to infer technical and statistical error. The ALDEx2 approach is shown to be suitable for all three types of data: it correctly identifies both the direction and differential abundance of features in the differential growth experiment, it identifies a set of differentially expressed genes in the RNA-seq dataset that is substantially similar to the set identified by the leading tools, and it identifies as differential the taxa that distinguish the tongue dorsum and buccal mucosa in the Human Microbiome Project dataset. The design of ALDEx2 reduces the number of false positive identifications that result from datasets composed of many features in few samples.

CONCLUSION: Statistical analysis of high-throughput sequencing datasets composed of per-feature counts showed that the ALDEx2 R package is a simple and robust tool, which can be applied to RNA-seq, 16S rRNA gene sequencing and differential growth datasets, and by extension to other techniques that use a similar approach.
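The abstract describes transforming per-feature read counts into relative abundances and using Bayesian methods to account for technical error, but gives no formulas. The sketch below is one possible illustration of that idea, not ALDEx2's own code. It rests on assumptions that are not stated in the abstract: the centered log-ratio (CLR) as the compositional transform, Dirichlet draws with a uniform prior of 0.5 as the Bayesian step, and the function names `closure`, `clr`, `dirichlet_instance` and `clr_monte_carlo`, which are all hypothetical.

```python
import math
import random


def closure(counts):
    """Convert raw counts to relative abundances that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(proportions):
    """Centered log-ratio: log of each part minus the mean of the logs.

    Assumption: CLR is used as the compositional transform. All parts
    must be strictly positive.
    """
    logs = [math.log(p) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def dirichlet_instance(counts, prior=0.5, rng=random):
    """Draw one set of proportions from Dirichlet(counts + prior).

    Assumption: a prior of 0.5 per feature. A Dirichlet draw is obtained
    by normalising independent Gamma(alpha, 1) draws, so zero counts
    still yield positive proportions.
    """
    draws = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]


def clr_monte_carlo(counts, n_instances=128, seed=0):
    """Return CLR-transformed Monte Carlo instances for one sample."""
    rng = random.Random(seed)
    return [clr(dirichlet_instance(counts, rng=rng)) for _ in range(n_instances)]


if __name__ == "__main__":
    # Hypothetical read counts for four features in one sample.
    sample = [0, 12, 250, 3100]
    instances = clr_monte_carlo(sample, n_instances=4, seed=1)
    for instance in instances:
        print([round(v, 3) for v in instance])
```

Two properties make this kind of transform attractive for count data. CLR values depend only on the ratios between features, so sequencing a sample ten times deeper leaves them unchanged. Repeating the Dirichlet draw also gives a spread of plausible values for each feature, which is widest for features with few or zero reads.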
